AITopics | vector distribution


vector distribution


Curator: Efficient Indexing for Multi-Tenant Vector Databases

Jin, Yicheng, Wu, Yongji, Hu, Wenjun, Maggs, Bruce M., Zhang, Xiao, Zhuo, Danyang

arXiv.org Artificial Intelligence, Jan-13-2024

Vector databases have emerged as key enablers for bridging intelligent applications with unstructured data, providing generic search and management support for embedding vectors extracted from the raw unstructured data. As multiple data users can share the same database infrastructure, multi-tenancy support for vector databases is increasingly desirable. This hinges on an efficient filtered search operation, i.e., querying only the vectors accessible to a particular tenant. Multi-tenancy in vector databases is currently achieved by building either a single index shared among all tenants or a separate index per tenant. The former optimizes for memory efficiency at the expense of search performance, while the latter does the opposite. This paper instead presents Curator, an in-memory vector index design tailored for multi-tenant queries that achieves both conflicting goals at once: low memory overhead and high performance for queries, vector insertion, and deletion. Curator indexes each tenant's vectors with a tenant-specific clustering tree and encodes these trees compactly as sub-trees of a shared clustering tree. Each tenant's clustering tree adapts dynamically to its unique vector distribution, while maintaining a low per-tenant memory footprint. Our evaluation, based on two widely used data sets, confirms that Curator delivers search performance on par with per-tenant indexing, while maintaining memory consumption at the same level as metadata filtering on a single, shared index.
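The two baseline strategies the abstract contrasts can be illustrated with a minimal brute-force sketch. This is not Curator's clustering-tree design. It only shows filtered search on a shared store versus a lookup in a per-tenant copy. All names (`search_shared`, `build_per_tenant`, `search_per_tenant`) are hypothetical, and the sketch assumes each vector belongs to exactly one tenant, which is a simplification.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def top_k(candidates, query, k):
    """candidates: iterable of (vector_id, vector). Returns the k nearest ids."""
    ranked = sorted(candidates, key=lambda item: sq_dist(item[1], query))
    return [vec_id for vec_id, _ in ranked[:k]]

def search_shared(vectors, owners, tenant, query, k):
    """Shared store: one copy of all vectors, filtered by owner at query time."""
    visible = ((i, v) for i, v in enumerate(vectors) if owners[i] == tenant)
    return top_k(visible, query, k)

def build_per_tenant(vectors, owners):
    """Per-tenant store: a separate candidate list for every tenant."""
    index = {}
    for i, v in enumerate(vectors):
        index.setdefault(owners[i], []).append((i, v))
    return index

def search_per_tenant(index, tenant, query, k):
    """Per-tenant store: search touches only that tenant's own list."""
    return top_k(index.get(tenant, []), query, k)

# Synthetic data (assumption): 200 random 8-d vectors spread over 4 tenants.
random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(8)] for _ in range(200)]
owners = [random.randrange(4) for _ in range(200)]
query = [random.gauss(0, 1) for _ in range(8)]

per_tenant = build_per_tenant(vectors, owners)
shared_hits = search_shared(vectors, owners, 2, query, 5)
tenant_hits = search_per_tenant(per_tenant, 2, query, 5)
print(shared_hits == tenant_hits)
```

Both functions return the same neighbours. The difference is where the cost falls: the shared store scans every tenant's vectors on each query, while the per-tenant store pays for separate structures up front.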

shortlist, tenant, vector, (17 more...)


arXiv:2401.07119

Country:

North America > United States > California > Alameda County > Oakland (0.04)
Europe > Poland (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology (1.00)
Media (0.67)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)


Optimal transport for vector Gaussian mixture models

Zhu, Jiening, Xu, Kaiming, Tannenbaum, Allen

arXiv.org Machine Learning, Dec-16-2020

Finite mixture models can describe a wide range of statistical phenomena. They have been successfully applied in numerous fields, including biology, economics, engineering, and the social sciences [15]. The first major use and analysis of mixture models is perhaps due to the mathematician and biostatistician Karl Pearson, who over 120 years ago explicitly decomposed a distribution into two normal distributions to characterize non-normal attributes of forehead-to-body-length ratios in female shore crab populations [16]. The literature on analyzing and applying mixture models is growing due to their simplicity, versatility, and flexibility. One of the most commonly used mixture models is the Gaussian mixture model (GMM), which is a weighted sum of Gaussian distributions.
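The "weighted sum of Gaussian distributions" can be made concrete with a short sketch. It evaluates a one-dimensional, two-component mixture, echoing Pearson's two-normal decomposition, and checks numerically that the density integrates to about one. The weights, means, and standard deviations are made-up illustrative values, and the scalar case is a simplification of the vector-valued setting in the title.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def gmm_pdf(x, weights, means, stds):
    """Mixture density: sum of component densities scaled by their weights."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

# Illustrative parameters (assumptions); weights are non-negative and sum to 1.
weights = [0.3, 0.7]
means = [-2.0, 1.5]
stds = [0.5, 1.0]

# Riemann-sum check on [-10, 10] that the mixture is a valid density.
step = 0.01
grid = [-10.0 + step * i for i in range(2001)]
area = sum(gmm_pdf(x, weights, means, stds) for x in grid) * step
print(round(area, 3))
```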

gaussian, interpolation, vector distribution, (16 more...)


arXiv:2012.09226

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning (0.46)


Learning via Gaussian Herding

Crammer, Koby, Lee, Daniel D.

Neural Information Processing Systems, Dec-31-2010

We introduce a new family of online learning algorithms based upon constraining the velocity flow over a distribution of weight vectors. In particular, we show how to effectively herd a Gaussian weight vector distribution by trading off velocity constraints with a loss function. By uniformly bounding this loss function, we demonstrate how to solve the resulting optimization analytically. We compare the resulting algorithms on a variety of real-world datasets and demonstrate how these algorithms achieve state-of-the-art robust performance, especially with high label noise in the training data.

algorithm, artificial intelligence, machine learning, (19 more...)


Country: North America > United States (1.00)

Industry: Education > Educational Setting > Online (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
